Prior preference learning from experts: Designing a reward with active inference

Authors

Abstract

Active inference may be defined as Bayesian modeling of a brain with a biologically plausible model of the agent. Its primary idea relies on the free energy principle and the prior preference of the agent: an agent will choose an action that leads to its prior preference for a future observation. In this paper, we claim that active inference can be interpreted using reinforcement learning (RL) algorithms, and we find a theoretical connection between them. We extend the concept of expected free energy (EFE), which is a core quantity in active inference, and show that EFE can be treated as a negative value function. Motivated by this connection, we propose a simple but novel method for learning a prior preference from experts. This illustrates that the problem of inverse RL can be approached from the new perspective of active inference. Experimental results show the possibility of an EFE-based reward and its application to the inverse RL problem.
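The abstract's claim that EFE can act as a negative value function can be sketched with the standard definition of expected free energy from the active-inference literature; the notation below is the conventional formulation, not necessarily the paper's exact one:

```latex
% Expected free energy of a policy \pi at a future time \tau,
% where \tilde{P}(o_\tau, s_\tau) encodes the agent's prior preference:
G(\pi, \tau) = \mathbb{E}_{Q(o_\tau, s_\tau \mid \pi)}
  \left[ \ln Q(s_\tau \mid \pi) - \ln \tilde{P}(o_\tau, s_\tau) \right]

% Reading -G as a value function suggests an EFE-based reward
% proportional to the log prior preference over observations:
r(o_\tau) \propto \ln \tilde{P}(o_\tau)
```

Under this reading, learning a prior preference from expert data plays the role that reward inference plays in inverse RL.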


Related articles

Active Preference-Based Learning of Reward Functions

Our goal is to efficiently learn reward functions encoding a human’s preferences for how a dynamical system should act. There are two challenges with this. First, in many problems it is difficult for people to provide demonstrations of the desired system trajectory (like a high-DOF robot arm motion or an aggressive driving maneuver), or to even assign how much numerical reward an action or traj...


Dopamine, reward learning, and active inference

Temporal difference learning models propose phasic dopamine signaling encodes reward prediction errors that drive learning. This is supported by studies where optogenetic stimulation of dopamine neurons can stand in lieu of actual reward. Nevertheless, a large body of data also shows that dopamine is not necessary for learning, and that dopamine depletion primarily affects task performance. We ...


Active Reward Learning from Critiques

Learning from demonstration algorithms, such as Inverse Reinforcement Learning, aim to provide a natural mechanism for programming robots, but can often require a prohibitive number of demonstrations to capture important subtleties of a task. Rather than requesting additional demonstrations blindly, active learning methods leverage uncertainty to query the user for action labels at states with ...


Active learning with a misspecified prior

We study learning and information acquisition by a Bayesian agent whose prior belief is misspecified in the sense that it assigns probability zero to the true state of the world. At each instant, the agent takes an action and observes the corresponding payoff, which is the sum of a fixed but unknown function of the action and an additive error term. We provide a complete characterization of asy...


Preference Inference through Rescaling Preference Learning

One approach to preference learning, based on linear support vector machines, involves choosing a weight vector whose associated hyperplane has maximum margin with respect to an input set of preference vectors, and using this to compare feature vectors. However, as is well known, the result can be sensitive to how each feature is scaled, so that rescaling can lead to an essentially different ve...



Journal

Journal: Neurocomputing

Year: 2022

ISSN: 0925-2312, 1872-8286

DOI: https://doi.org/10.1016/j.neucom.2021.12.042